gh-146302: make Py_IsInitialized() thread-safe and reflect true init completion#146303
gh-146302: make Py_IsInitialized() thread-safe and reflect true init completion#146303gpshead wants to merge 7 commits intopython:mainfrom
Conversation
… init completion Py_IsInitialized() previously returned 1 before Py_InitializeEx() had finished, because the runtime flag was set before site.py was imported. This caused a race in embedders like PyO3 where a second thread could observe an initialized interpreter before sys.path was fully configured. Move the initialized=1 store to the end of init_interp_main(), after site import, lazy imports, tier 2 optimizer, and dict watcher setup. Access both `initialized` and `core_initialized` through new inline accessors that use acquire/release atomics, eliminating the C-standard data race and ensuring correct visibility on weakly-ordered architectures. Fix PySys_AddAuditHook() to check core_initialized (not initialized) when deciding whether a thread state is available, since tstate exists after core init completes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
| .. versionchanged:: next | ||
| This function no longer returns true until initialization has fully | ||
| completed, including import of the :mod:`site` module. Previously it | ||
| could return true while :c:func:`Py_Initialize` was still running. |
There was a problem hiding this comment.
this is technically a bug fix, we don't always call these out in the main docs. i'm happy dropping this, but it feels worthwhile to mention for embedders likely to use this API. and if we choose to backport this as a bugfix it'll include proper 3.14.x and 3.13.x attribution to indicate when people need to work around it.
There was a problem hiding this comment.
It is fine to add versionchanged when an important behavior needed to be pointed out in a bugfix release so I don't mind keeping this here.
There was a problem hiding this comment.
However considering what it changed I think it is best to keep it in 3.15 only.
There was a problem hiding this comment.
Agreed, I'm fine not backporting this. It's rare enough and PyO3 already has a workaround.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
| } | ||
| Py_XDECREF(warnoptions); | ||
|
|
||
| interp->runtime->initialized = 1; |
There was a problem hiding this comment.
Moving this down after site and the rest of the machinery is the part I'm most interested in seeing tested. Intuitively I would not expect any code site may import to ever care about Py_IsInitialized() itself as extension module code "should" never have a reason to use that API as it shouldn't be called outside of an interpreter.
But this is the one thing in this change that might be an observable behavior change. Even if we don't agree with someone relying on the existing behavior.
The docs say audit hooks are notified "after runtime initialization". With the flag move, initialized=0 throughout init (including site import), so hooks are correctly not invoked during any init phase. Using core_initialized would have expanded the invocation window beyond the original behavior. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Use a C audit hook registered before Py_Initialize() to observe the value of Py_IsInitialized() when the "import" event fires for "site". With the old flag position this would have returned 1; after pythongh-146302 it correctly returns 0. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1cf04e0 to
dafd62e
Compare
Use _Py_IsCoreInitialized() in preconfig.c and Py_IsInitialized() in Py_InitializeEx(), removing the unnecessary runtime local variable. Thanks picnixz!
dafd62e to
ba28b37
Compare
picnixz
left a comment
There was a problem hiding this comment.
OOC there are lots of boolean flags that are still bare ints. Are they all protected against races or?
probably not technically... I doubt the data race potential honestly exposes itself much in this area in practice as filling these in happens so much sooner than most anything that could be consuming them that all memory writes long since land. Py_IsInitialized() being a reader intended for concurrent access feels like an uncommon case. But I didn't try to analyze the others. asking Opus 4.6 to analyze their use for races (my summary): [which makes sense to me] |
|
🤖 New build scheduled with the buildbot fleet by @gpshead for commit abae231 🤖 Results will be shown at: https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F146303%2Fmerge If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again. |
|
Thanks for doing this for PyO3's sake. Does thread-safe here mean that PyO3 could in principle freely call (I appreciate what PyO3 is doing is slightly weird, because there's no way to reliably know which one of those threads becomes the "main thread". It's good enough for Rust test suites, at least!) |
|
Sorry, I see I misread the title - this is about |
No, CPython does not have a block until initialization completes API. That might make sense as a feature request on its own but I'd wonder what the practical motivation for it was and why we'd want people to do it. |
|
I think it's fine to leave that synchronization downstream in PyO3 rather than complicate the C API. I think in most embedding contexts you want to be deliberate about interpreter lifecycle; what PyO3 does with the concurrent initialization is a quirk of Rust test suites launching multiple threads with no particular need for any proper finalization. |
|
Agreed. We currently have a variety of C API surfaces (like Initialize and IsInterpreter) that are easier when embedding processes take a wide "hey i'm going to have a Python interpreter, let me go ahead and set that up once up front instead of having everything that wants to use it try to do that at time of first use" approach. That is not really practical for all situations. Your Rust example of using one from some test cases is a good example. There are others, but they might expand into a larger C API and one interpreter vs multiple interpreters vs subinterpreters question. Imagine two independent libraries linked into a process unaware of each-other both wanting to use an embedded Python interpreter for internal use. As independent code, they have no good way to coordinate in a multithreaded context where they might both be trying at once. Which still ignores the question of if they need isolated interpreters or not. All of that could turn into a discuss.python.org thread... |
| } | ||
| } | ||
|
|
||
| // Release on store, acquire on load: a thread that reads initialized=1 |
There was a problem hiding this comment.
This code is not performance critical, it is better to use seq-cst here and not worry about reordering issues.
Summary
runtime->initialized = 1store from beforesite.pyimport to the end ofinit_interp_main(), soPy_IsInitialized()only returns true after initialization has fully completedinitializedandcore_initializedthrough new inline accessors using acquire/release atomics, to also protect from data race undefined behaviorPySys_AddAuditHook()now uses the accessor, so with the flag move it correctly skips audit hook invocation during all init phases (matching the documented "after runtime initialization" behavior) ... We could argue that running these earlier would be good even if the intent was never explicitly expressed, but that'd be its own issue.Motivation
Py_IsInitialized()returned 1 whilePy_InitializeEx()was still running — specifically, beforesite.pyhad been imported. See PyO3/pyo3#5900 where a second thread could acquire the GIL and start executing Python with an incompletesys.pathbecausesite.pyhadn't finished.The flag was also a plain
intwith no atomic operations, making concurrent reads a C-standard data race, though unlikely to manifest.Regression test:
The added test properly fails on
mainwithERROR: Py_IsInitialized() was true during site import.Py_IsInitialized()can return 1 beforePy_InitializeEx()has completed. #146302Fixes #146302
📚 Documentation preview 📚: https://cpython-previews--146303.org.readthedocs.build/